DeepION: A Deep Learning-Based Low-Dimensional Representation Model of Ion Images for Mass Spectrometry Imaging

Mass spectrometry imaging (MSI) is a high-throughput imaging technique capable of the qualitative and quantitative in situ detection of thousands of ions in biological samples. Ion image representation produces a low-dimensional vector that embeds the significant spectral and spatial information of an ion image, which in turn facilitates distance-based similarity measurement for the identification of colocalized ions. However, given the low signal-to-noise ratios inherent in MSI data and the scarcity of annotated data sets, achieving an effective representation for each ion image remains a challenge. In this study, we propose DeepION, a novel deep learning-based method designed specifically for ion image representation, which is applied to the identification of colocalized ions and isotope ions. In DeepION, contrastive learning is introduced so that the model can generate the ion image representation in a self-supervised manner without manual annotation. Since data augmentation is a crucial step in contrastive learning, a dedicated data augmentation strategy is designed around the characteristics of MSI data, such as the Poisson distribution of ion abundance and the random pattern of missing values, to generate plentiful ion image pairs for DeepION model training. Experimental results on rat brain tissue MSI data show that DeepION outperforms other methods for both colocalized ion and isotope ion identification, demonstrating the effectiveness of its ion image representation. The proposed model could serve as a crucial tool for MSI-based biomarker discovery and drug development.


This supplementary file includes:
1. Supplementary Materials
Material S1. Data augmentation based on MSI prior knowledge.
Material S2. Details of discovering isotope ions.
Table S1. Isotope ion discovery for ions m/z 302.935, m/z 699.493, m/z 718.534, and m/z 1544.847 using PCC and $R^2$.

Supplementary Tables
Table S2. The isotope ions identified by the ISO mode of DeepION in the rat brain dataset under negative ion mode.
Table S3. The isotope ions identified by the ISO mode of DeepION in the rat brain dataset under positive ion mode.

Material S2. Details of discovering isotope ions
The isotope ion annotation process is as follows.
Step 1: Rank all ions by m/z in ascending order to obtain an ion set $I = \{I_i\}_{i=1}^{N}$.
Step 2: Calculate the Euclidean distance between each pair of ion representation vectors to obtain the similarity matrix $S = (s_{ij})_{i,j=1}^{N}$, where $s_{ij}$ is the similarity score between $I_i$ and $I_j$.
Step 3: Take each $I_i$ ($i = 1, 2, \cdots, N$) in sequence as a candidate monoisotope ion M, and search for its isotope ions M + k ($k = 1, 2, \ldots, 4$) among $I_j$ ($j > i$) according to the following three criteria: (1) the variation in m/z between $I_i$ and $I_j$ is lower than 5 ppm of the m/z value of $I_i$; (2) $s_{ij} < T$, where the threshold is set to $T = 0.25$; (3) if the isotope ion M + k is not found, the search does not continue to the isotope ion M + k + 1.
Step 4: Record the monoisotope ion $I_i$ and the isotope ions found in the above procedure and remove them from the ion set $I$, then return to Step 3 until the ion set $I$ is empty.
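The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function name `find_isotope_groups` and the constant `ISOTOPE_SPACING` are hypothetical. The text does not spell out how the 5 ppm tolerance is evaluated, so the sketch assumes it bounds the deviation from an expected M + k position spaced by about 1.00335 Da (the 13C-12C mass difference) for singly charged ions. Only the 5 ppm tolerance, the 0.25 threshold, and the limit of four isotopes come from the text.

```python
import numpy as np

# Assumption (not stated in the text): isotope peaks of singly charged ions
# are spaced by the 13C-12C mass difference, about 1.00335 Da.
ISOTOPE_SPACING = 1.00335


def find_isotope_groups(mz, embeddings, ppm=5.0, threshold=0.25,
                        max_k=4, spacing=ISOTOPE_SPACING):
    """Greedy isotope grouping; returns lists [M, M+1, ...] of m/z values."""
    mz = np.asarray(mz, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)

    # Step 1: rank ions by m/z in ascending order.
    order = np.argsort(mz)
    mz, embeddings = mz[order], embeddings[order]

    # Step 2: pairwise Euclidean distances between representation vectors
    # (a smaller value means the two ion images are more similar).
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    remaining = set(range(len(mz)))
    groups = []
    for i in range(len(mz)):
        if i not in remaining:
            continue  # already recorded as part of an earlier group
        group = [i]
        tol = ppm * 1e-6 * mz[i]  # criterion (1): 5 ppm of the candidate m/z
        # Step 3: search M + k for k = 1..max_k among heavier ions.
        for k in range(1, max_k + 1):
            target = mz[i] + k * spacing
            candidates = [
                j for j in range(i + 1, len(mz))
                if j in remaining
                and abs(mz[j] - target) < tol   # criterion (1)
                and dist[i, j] < threshold      # criterion (2)
            ]
            if not candidates:
                break  # criterion (3): stop once M + k is missing
            group.append(min(candidates, key=lambda j: dist[i, j]))
        # Step 4: remove the candidate and its isotopes from the ion set.
        remaining.difference_update(group)
        if len(group) > 1:
            groups.append([float(mz[j]) for j in group])
    return groups


if __name__ == "__main__":
    mz = [300.0, 301.00335, 302.0067, 500.0]
    emb = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
    print(find_isotope_groups(mz, emb))
```

Ions with no isotope partner are still removed from the set in Step 4 but are not reported, which keeps the output limited to actual monoisotope-isotope groups. When several candidates satisfy both criteria for the same k, the sketch keeps the one with the smallest distance; the text does not say how such ties are resolved, so this is also an assumption.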

Supplementary figures
Figure S1. Difference between monoisotope-isotope ions and co-localized ions. Co-localized ions m/z 755.388 and m/z 846.422 are both expressed in most areas of white matter, while monoisotope-isotope ions m/z 754.363 and m/z 755.388 are not only expressed in most areas of white matter but are also expressed at higher levels in the white matter region of the cerebellum. The proposed DeepION in CO mode scores both co-localized ions and isotope ions as similar, while ISO mode scores only isotope ions as similar.

Figure S2. Architecture of each module.

Figure S3. Identification of co-localized ions for representative query ions in positive mode.

Figure S4. Co-localization ion discovery for query ion m/z 213.902 using different methods: the SIM-based methods (a) Euclidean distance, (b) cosine distance, (c) PCC, and (d) $R^2$; the DR-based methods (e) PCA, (f) t-SNE, and (g) UMAP; and the DL-based methods (h) SSIM, (i) ResNet18, and (j) SimSiam. Here, a smaller distance indicates a higher degree of similarity between the ion image and the query image. Conversely, for metrics such as PCC, $R^2$, and SSIM, a larger value means a closer resemblance between the two ion images. Results judged erroneous by visual inspection are marked with a red "×".

Figure S5. Co-localization ion discovery for query ion m/z 214.047 using different methods.

Figure S6. Data distribution of each co-localized ion category in the manually annotated dataset.

Figure S7. The average mass spectra of four randomly selected regions.

Figure S8. The isotope ions identified in negative mode. The low m/z range, i.e., m/z 258 to m/z 290, is displayed in particular.

Figure S10. Two examples that correspond to the blue marks in Table S2.